International Journal of Electrical and **Electronics Engineering Research (IJEEER)** ISSN(P): 2250-155X; ISSN(E): 2278-943X Vol. 4, Issue 1, Feb 2014, 143-150

© TJPRC Pvt. Ltd.

DESIGN OF 6 TAP FIR FILTER USING VLSI FOR LOW POWER MAC

ASHISH B. KHARATE<sup>1</sup> & P. R. GUMBLE<sup>2</sup>

<sup>1</sup>Department of Electronics and Telecommunication, Sant Gadge Baba Amravati University,

Amravati, Maharashtra, India

<sup>2</sup>Professor, Department of Engineering and Technology, Sipna College Amravati, Maharashtra, India

ABSTRACT

In the majority of digital signal processing (DSP) applications, the critical operations are multiplication and accumulation. A multiplier-accumulator (MAC) unit that consumes low power is therefore key to achieving a high-performance digital signal processing system. Finite impulse response (FIR) filters are widely used in various DSP applications. The purpose of this work is to design and implement an FIR filter using a low-power MAC unit, with clock gating and pipelining techniques to save power.

KEYWORDS: MAC, Low Power, Glitch Reduction, Clock

INTRODUCTION

Finite impulse response (FIR) filters are widely used in various DSP applications. This paper describes an approach to the implementation of a low-power digital FIR filter based on field programmable gate arrays (FPGAs). The advantages of the FPGA approach to digital filter implementation include higher sampling rates than are available from traditional DSP chips, lower costs than an ASIC for moderate-volume applications, and more flexibility than the alternative approaches. First, a single MAC unit is designed with appropriate geometries that give optimized power, area and delay. Similarly, N MAC units are designed and controlled for low power using control logic that enables each stage at the appropriate time. The multiplier-accumulator unit has become one of the essential building blocks in digital signal processing applications such as digital filtering, speech processing, video coding and cellular phones.

This work also investigates various multiplier and adder architectures that are suitable for high-throughput signal processing while achieving low power consumption. The results reported in this paper show that the latch-based design reduces dynamic power consumption by about 92% and that pipelining reduces it by up to 95%.

MULTIPLY-ACCUMULATE UNITS

A variety of approaches to the implementation of the multiplication and addition portions of the MAC function are possible. A conventional MAC unit consists of a multiplier and an accumulator that contains the sum of the previous consecutive products. The structure of the MAC unit is illustrated in Figure 1. The operation consists of multiplying two values and adding the result to the previously accumulated value, which must then be stored back in the registers for future accumulations. The function of the MAC unit is given by the equation below.

In computing, especially digital signal processing, the multiply-accumulate operation is a common step that computes the product of two numbers and adds that product to an accumulator. The hardware unit that performs the operation is known as a multiplier accumulator (MAC or MAC unit); the operation itself is also often called a MAC or a MAC operation. The MAC operation modifies an accumulator a:


$$a \leftarrow a + (b \times c)$$

Modern computers may contain a dedicated MAC, consisting of a multiplier implemented in combinational logic followed by an adder and an accumulator register that stores the result. The output of the register is fed back to one input of the adder, so that on each clock cycle, the output of the multiplier is added to the register. Combinational multipliers require a large amount of logic, but can compute a product much more quickly than the method of shifting and adding typical of earlier computers. The first processors to be equipped with MAC units were digital signal processors, but the technique is now also common in general-purpose processors.
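The accumulator feedback described above can be sketched in a few lines of Python. This is only a behavioural illustration, not the hardware design of this paper; the function names `mac` and `dot_product` are hypothetical.

```python
def mac(acc, b, c):
    """One multiply-accumulate step: return acc + b * c."""
    return acc + b * c


def dot_product(xs, ws):
    """Accumulate the products of paired values, one MAC step per pair."""
    acc = 0  # accumulator register, cleared before the first product
    for x, w in zip(xs, ws):
        acc = mac(acc, x, w)  # register output is fed back to the adder
    return acc


if __name__ == "__main__":
    print(mac(10, 3, 4))
    print(dot_product([1, 2, 3], [4, 5, 6]))
```

Each loop iteration corresponds to one clock cycle of the hardware unit: the multiplier output is added to the value held in the accumulator register.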

We propose a design methodology for the MAC unit structure that is extended to handle two's complement multiplication, as shown in Figure 1. The major components of this signed MAC unit are a signed multiplier, a signed adder, a multiplexer and an XOR gate. We choose a 12-bit precision input bus and add one extra sign bit, so 13 bits are applied at the input side and the output has 31-bit precision.
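As a quick plausibility check on these word lengths, the following Python sketch assumes that the 13-bit inputs are interpreted as two's complement values and that six products are accumulated, as in the 6-tap filter. The helper names are hypothetical, and the check says nothing about how the hardware actually allocates its bits.

```python
IN_BITS = 13   # 12-bit value plus one sign bit, as stated in the text
OUT_BITS = 31  # output precision stated in the text
TAPS = 6       # assumption: six products are accumulated per output


def signed_range(bits):
    """Return (min, max) of a two's complement number with `bits` bits."""
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1


def wrap_signed(value, bits):
    """Interpret the low `bits` bits of `value` as a two's complement number."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        value -= 1 << bits
    return value


in_lo, in_hi = signed_range(IN_BITS)
worst_product = in_lo * in_lo        # largest-magnitude single product
worst_sum = TAPS * worst_product     # largest-magnitude accumulated sum
out_lo, out_hi = signed_range(OUT_BITS)
fits = out_lo <= worst_sum <= out_hi

if __name__ == "__main__":
    print(in_lo, in_hi, worst_product, worst_sum, fits)
```

Under these assumptions the worst-case accumulated sum stays within a 31-bit signed range, so the accumulator cannot overflow for six taps.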



Figure 1: Basic Structure of Signed MAC Unit

# 6 TAP FIR FILTER



Figure 2: Basic Structure of 6 tap FIR Filter

In this work we propose a design of a 6-tap FIR filter, as shown in Figure 2. The input is delayed and applied to the multipliers; each multiplier produces the product of a delayed sample and the corresponding filter coefficient, and all of these products are accumulated to give the FIR filter output. We obtained a set of coefficients from MATLAB and converted them to binary for input to the designed filter; any other coefficients can equally be supplied to the filter.
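The data flow of such a filter can be illustrated with a short behavioural model in Python. It is a sketch under stated assumptions: the coefficient values and the 12 fractional bits used for quantization are illustrative only and are not the coefficients used in this work, and the function names are hypothetical.

```python
def quantize(coeffs, frac_bits=12):
    """Round real coefficients to integers with `frac_bits` fractional bits.

    Assumption: every coefficient has magnitude below 1, so the result
    fits a 13-bit signed word when frac_bits is 12.
    """
    return [round(c * (1 << frac_bits)) for c in coeffs]


def fir_filter(samples, coeffs):
    """Direct-form FIR filter: delay line, one multiply per tap, accumulate."""
    delay_line = [0] * len(coeffs)
    outputs = []
    for x in samples:
        delay_line = [x] + delay_line[:-1]  # shift the new sample in
        acc = 0
        for d, h in zip(delay_line, coeffs):
            acc += d * h                    # one MAC operation per tap
        outputs.append(acc)
    return outputs


if __name__ == "__main__":
    taps = [1, 2, 3, 4, 5, 6]                     # illustrative integer taps
    print(fir_filter([1, 0, 0, 0, 0, 0], taps))   # impulse response
    print(quantize([0.5, -0.25]))
```

Feeding a unit impulse into the model returns the coefficients in order, which is a convenient sanity check for any FIR implementation.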

### POWER CONSUMPTION

A limiting factor in many modern DSP systems is power consumption. This is due to two different problems. First, as systems become larger, whole systems are integrated on a single chip, i.e., System-on-Chip (SoC), and the clock frequency is increased, the total power dissipation approaches the limit at which an expensive cooling system is required to avoid overheated chips. Second, portable equipment such as cellular phones and portable computers is becoming increasingly popular. These products use batteries as their power supply. A decrease in power consumption increases portability, since smaller batteries can be used with a longer lifetime between recharges. Hence, design for low power consumption is important.

#### DYNAMIC POWER DISSIPATIONS

Dynamic power makes up a large portion of the total power consumed by an FPGA design; in CMOS circuits it is the dominant source of power dissipation. Dynamic power is dissipated whenever logic levels change at points in the circuit in response to changes in the input signals. It is determined by the following equation.

$$P_n = \alpha C V^2 f$$

where $\alpha$ is the switching activity factor, C is the capacitance, V is the supply voltage, and f is the clock frequency. In addition to voltage and physical capacitance, switching activity also influences dynamic power consumption. A chip may contain an enormous amount of physical capacitance, but if there is no switching in the circuit, then no dynamic power will be consumed. The data activity determines how often this switching occurs.
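A worked example may help to show how the terms of this equation interact. The numbers in the Python sketch below are purely hypothetical and are not measurements from this work; the function name `dynamic_power` is likewise an assumption.

```python
def dynamic_power(alpha, capacitance_f, voltage_v, freq_hz):
    """Dynamic power in watts: alpha * C * V^2 * f."""
    return alpha * capacitance_f * voltage_v ** 2 * freq_hz


# Hypothetical operating point, chosen only for illustration.
ALPHA = 0.1    # switching activity factor
CAP_F = 1e-9   # switched capacitance in farads
VDD_V = 1.5    # supply voltage in volts
FREQ_HZ = 50e6 # clock frequency in hertz

baseline_w = dynamic_power(ALPHA, CAP_F, VDD_V, FREQ_HZ)
half_activity_w = dynamic_power(ALPHA / 2, CAP_F, VDD_V, FREQ_HZ)

if __name__ == "__main__":
    print(f"baseline:      {baseline_w * 1e3:.3f} mW")
    print(f"half activity: {half_activity_w * 1e3:.3f} mW")
```

Because power is linear in the activity factor, halving the switching activity halves the dynamic power, whereas the supply voltage enters quadratically.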

### LOW-POWER DESIGNS

Design for low power has become increasingly important in a wide variety of applications, including digital signal processing, mobile computing, high-performance computing, and high-speed networking. In this work the power reduction is achieved through the use of a MAC unit inside the filters that reduces the total activity and therefore the dynamic power. The dynamic power equation shows that dynamic power consumption is proportional to switching activity. Therefore, minimizing switching activity can effectively reduce the power dissipation without impacting the circuit performance.

The activity can be reduced with different methods and at different levels.

### **CLOCK GATING**

Low-power techniques are essential in modern VLSI design due to the continuous increase of clock frequency and chip complexity. Various recently proposed techniques achieve low-power operation by reducing signal switching activity. Such techniques are generally applied to internal nodes with high capacitive load, which contribute heavily to total power dissipation. In particular, the clock system, composed of flip-flops and the clock distribution network, is one of the most power-consuming subsystems in a VLSI circuit. As a consequence, many techniques have been proposed to reduce clock system power dissipation.
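The effect of gating the clock on switching activity can be illustrated with a toy count of the clock edges that reach a register. This Python sketch is an assumption-laden simplification: the enable schedule (one active cycle in six) and the function name are hypothetical, and real clock gating cells and clock tree loads are not modelled.

```python
def register_clock_events(enables, gated):
    """Count the clock edges that reach a register over a run of cycles.

    enables: one boolean per clock cycle, True when the register must load.
    gated:   if True, the clock is passed only when the enable is active;
             if False, the register is clocked on every cycle.
    """
    if gated:
        return sum(1 for en in enables if en)
    return len(enables)


# Hypothetical schedule: the stage is needed in one cycle out of every six.
schedule = [cycle % 6 == 0 for cycle in range(60)]
free_running = register_clock_events(schedule, gated=False)
gated_count = register_clock_events(schedule, gated=True)

if __name__ == "__main__":
    print(free_running, gated_count)
```

With this schedule the gated register sees one sixth of the clock edges, which lowers the activity factor of that part of the clock network accordingly.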

#### **LATCH-BASED DESIGN**

In some applications, latch-based designs are preferred to D flip-flop (DFF) based designs. The basic concept is that a DFF can be split into two latches, each clocked with an independent clock signal. [14]

The two clocks are non-overlapping, as presented in Figure 3. A combinational network is usually inserted between the two latches to build a pipelined data path. The main advantage is that this kind of design tolerates greater clock skew before failing than a similar DFF-based design. The second advantage is that time borrowing is achieved naturally in the pipelined data path.
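The idea of splitting a flip-flop into two latches driven by non-overlapping clocks can be modelled behaviourally as follows. This Python sketch is a simplification with hypothetical names; it omits the combinational logic between the latches and any timing effects such as skew or time borrowing.

```python
class Latch:
    """Level-sensitive latch: transparent while its clock is high."""

    def __init__(self):
        self.q = 0

    def update(self, d, clk):
        if clk:
            self.q = d  # follow the input while the clock is high
        return self.q   # hold the stored value while the clock is low


def two_phase_register(data_per_cycle):
    """Pass data through two latches clocked by non-overlapping phases.

    The output is sampled at the start of each cycle, before clk1 rises,
    so the pair behaves like a register with one cycle of delay.
    """
    first, second = Latch(), Latch()
    outputs = []
    for d in data_per_cycle:
        outputs.append(second.q)
        first.update(d, clk=1)          # phase 1: clk1 high, clk2 low
        first.update(d, clk=0)          # gap: both clocks low
        second.update(first.q, clk=1)   # phase 2: clk2 high, clk1 low
        second.update(first.q, clk=0)   # gap: both clocks low
    return outputs


if __name__ == "__main__":
    print(two_phase_register([5, 7, 9]))
```

Since the two clocks are never high at the same time, data cannot race through both latches in one phase, which is why the pair behaves like an edge-triggered register.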


## CLOCK GATING OF LATCH BASED DESIGN

Latch-based designs provide several advantages over single clock master-slave Flip-Flop designs. The constraint with respect to the clock skew can be relaxed for both the clk1 and clk2 clock trees. This allows the synthesizer and router to use smaller clock buffers and to simplify the clock tree generation, which will reduce the power consumption of the clock tree.



Figure 3: Clock Gating of Latch Based Design

### **PIPELINING**

While FPGAs provide flexibility for performing high-performance DSP functions, they consume a significant amount of power. For arithmetic circuits, a large portion of the dynamic power is wasted on unproductive signal glitches, and previous studies have shown that glitching can account for a significant share of total dissipated power. Reducing signal glitching within the circuit is therefore an important technique for reducing FPGA power consumption. Pipelining is one such technique: a pipelined design has less logic between registers and is therefore less prone to glitching. Previous studies have shown that pipelining can be used to reduce power by 90%.

### ACTIVE HDL SCHEMATIC



Figure 4: Active HDL Simulation of 6 Tap Sequential FIR Digital Filter



Figure 5: Active HDL Simulation of 6 Tap Latch based FIR Digital Filter



Figure 6: Active HDL Simulation of 6 Tap Pipelined FIR Digital Filter

## SIMULATION AND RESULTS

An FIR filter scheme suitable for unsigned and signed computations is presented in this paper. Low-power designs of the 6-tap FIR filter using latch-based and pipelining techniques are implemented. The filters are designed using MATLAB and described in VHDL. Simulation is performed using Active-HDL, functional verification is carried out using Altera Quartus II, and the design is implemented on a Cyclone FPGA. Figure 7 shows the simulation result for the sequential 6-tap FIR filter, Figure 8 shows the result for the latch-based FIR filter, and Figure 9 shows the result for the pipelined FIR filter; the corresponding Active-HDL views are given in Figures 4 to 6. The input values and filter coefficients are supplied to the digital FIR filter by a finite state machine.



Figure 7: Simulation Results for Sequential 6 Tap FIR Filter



Figure 8: Simulation Results for Latch Based 6 Tap FIR Filter



Figure 9: Simulation Results for 6 Tap Pipelined FIR Filter


## **Power Analyzer Summary**

**Table 1: Power Analyzer Summary** 

| 6 Tap FIR Filter   | Core Dynamic Thermal Power Dissipation (mW) |
|--------------------|---------------------------------------------|
| Sequential filter  | 21.40                                       |
| Latch-based filter | 1.73                                        |
| Pipelined filter   | 1.09                                        |

Table 1 shows that dynamic power consumption is decreased by applying two techniques to the original 6-tap FIR filter: latch-based clock gating and pipelining. The proposed FIR filters have been synthesized and implemented using Altera Quartus II, and power is analyzed using the PowerPlay Power Analyzer tool.
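The relative savings implied by Table 1 can be recomputed directly from the reported values. The Python sketch below uses only the three figures from the table; the variable and function names are hypothetical.

```python
# Core dynamic thermal power dissipation from Table 1, in milliwatts.
power_mw = {
    "sequential": 21.40,
    "latch_based": 1.73,
    "pipelined": 1.09,
}


def reduction_percent(baseline, improved):
    """Percentage by which `improved` is lower than `baseline`."""
    return 100.0 * (baseline - improved) / baseline


latch_saving = reduction_percent(power_mw["sequential"], power_mw["latch_based"])
pipe_saving = reduction_percent(power_mw["sequential"], power_mw["pipelined"])

if __name__ == "__main__":
    print(f"latch-based: {latch_saving:.1f} % lower than sequential")
    print(f"pipelined:   {pipe_saving:.1f} % lower than sequential")
```

The computed reductions, roughly 91.9% and 94.9%, agree with the rounded figures of 92% and 95% quoted in the text.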

## **CONCLUSIONS**

While FPGAs provide flexibility for performing high-performance DSP functions, they consume a significant amount of power. Often, a large portion of the dynamic power is wasted on unproductive signal glitches, so reducing glitching reduces dynamic energy consumption. A low-power design of a 6-tap FIR filter has been presented in this work. The power reduction is achieved through the use of a MAC unit inside the filters that reduces the total activity and therefore the dynamic power. The basic building blocks of the MAC unit are identified, each block is analyzed for its performance, and power is calculated for the blocks. The 6-tap digital FIR filter has been designed with enable control, and its total power consumption is reduced using pipelining and latch-based clock gating techniques.

Active-HDL together with the Altera Quartus II tool is used to model dynamic transient signal activity and to produce power consumption estimates. In this work the original FIR filter, the latch-based filter and the pipelined filter are implemented on a Cyclone EP1C6Q240C7 FPGA. The results in Table 1 show that the latch-based design reduces dynamic power consumption by about 92% and that pipelining reduces it by up to 95%.

### REFERENCES

- 1. ECE 2610, Introduction to Signals and Systems.
- 2. Keshab K. Parhi, "Pipelined and Parallel Recursive and Adaptive Filters."
- 3. Chi-Jui Chou, Satish Mohanakrishnan, Joseph B. Evans, "FPGA Implementation of Digital Filters," Proc. ICSPAT '93.
- 4. Shanthala S, S. Y. Kulkarni, "VLSI Design and Implementation of Low Power MAC Unit with Block Enabling Technique," Eurojournals Publishing Inc., 2009.
- 5. Julien Lamoureux and Wayne Luk, "An Overview of Low-Power Techniques for Field-Programmable Gate Arrays," IEEE Publications, 2008.
- 6. Nathaniel Rollins and Michael J. Wirthlin, "Reducing Energy in FPGA Multipliers Through Glitch Reduction."
- 7. Jitesh Shinde, S. S. Salankar, "Clock Gating: A Power Optimizing Technique for VLSI Circuits."
- 8. J. Bhasker, "VHDL Primer," Prentice-Hall, Inc., 1999.
- 9. Volnei A. Pedroni, "Circuit Design with VHDL," MIT Press, 2004.
- 10. Sanjiv Kumar Mangal, Rahul M. Badghare, Raghavendra B. Deshmukh, R. M. Patrikar, "FPGA Implementation of Low Power Parallel Multiplier," IEEE Publications, 2007.
- 11. Suvarna Joshi, Bharati Ainapure, "FPGA Based FIR Filter," International Journal of Engineering Science and Technology, Vol. 2 (12), 2010, 7320-7323.
- 12. Ravinder Kaur, Ashish Raman, Hardev Singh and Jagjit Malhotra, "Design and Implementation of High Speed IIR and FIR Filter using Pipelining," International Journal of Computer Theory and Engineering, Vol. 3, No. 2, April 2011, ISSN: 1793-8201.
- 13. O. Gustafsson, H. Johansson, and L. Wanhammar, "An MILP approach for the design of linear-phase FIR filters with minimum number of signed-power-of-two terms," in Proc. Eur. Conf. Circuit Theory Design.
- 14. H. Samueli, "An improved search algorithm for the design of multiplierless FIR filters with powers-of-two coefficients," IEEE Trans. Circuits.
- 15. Q. F. Zhao and Y. Tadokoro, "A simple design of FIR filters with powers-of-two coefficients," IEEE Trans. Circuits Syst., vol. 35, no. 5.
- 16. J. Yli-Kaakinen and T. Saramäki, "A systematic algorithm for the design of multiplierless FIR filters," in Proc. IEEE Int. Symp. Circuits.
- 17. Bahman Rashidi and Majid Pourormazd, "Design and implementation of low power digital FIR filter based on low power multipliers and adders on Xilinx FPGA," IEEE Publications, 2011.